Boosting universal speech attributes classification with deep neural network for foreign accent characterization
Abstract
We have recently proposed a universal acoustic characterisation approach to foreign accent recognition, in which any spoken foreign accent is described in terms of a common set of fundamental speech attributes. Although experimental evidence demonstrated the feasibility of our approach, we believe that speech attributes, namely manner and place of articulation, can be better modelled by a deep neural network. In this work, we propose the use of a deep neural network trained on telephone-bandwidth material from different languages to improve the proposed universal acoustic characterisation. We demonstrate that deeper neural architectures enhance attribute classification accuracy. Furthermore, we show that improvements in attribute classification carry over to foreign accent recognition, producing a 21% relative improvement over the previous baseline on spoken Finnish and a 5.8% relative improvement on spoken English.
Similar resources
Speech Attribute Detection Using Deep Learning
In this work we present alternative models for speech attribute feature extraction based on two state-of-the-art deep neural networks: convolutional neural networks (CNN) and feed-forward neural networks pretrained with a stack of restricted Boltzmann machines (DBN-DNN). These attribute detectors are trained using a data-driven approach across all languages in the OGI-TS multi-language te...
Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features
Automatic identification of foreign accents is valuable for many speech systems, such as speech recognition, speaker identification, voice conversion, etc. The INTERSPEECH 2016 Native Language Sub-Challenge is to identify the native languages of non-native English speakers from eleven countries. Since differences in accent are due to both prosodic and articulation characteristics, a combination...
Exploiting deep neural networks for detection-based speech recognition
In recent years deep neural networks (DNNs), that is, multilayer perceptrons (MLPs) with many hidden layers, have been successfully applied to several speech tasks, e.g., phoneme recognition, out-of-vocabulary word detection, confidence measures, etc. In this paper, we show that DNNs can be used to boost the classification accuracy of basic speech units, such as phonetic attributes (phonological featur...
Convolutional neural network with adaptive windows for speech recognition
Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their performance and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks
This paper investigates the use of multi-distribution deep neural networks (MD-DNNs) for automatic lexical stress detection and pitch accent detection, which are useful for suprasegmental mispronunciation detection and diagnosis in second-language (L2) English speech. The features used in this paper cover syllable-based prosodic features (including maximum syllable loudness, syllable nucleus du...